Maps are a popular means of visualising geospatial data. Companies often hold data points with latitude and longitude; nonetheless, data analysts often lack the skills to visualise this data as a map. This talk explains how to set up a map with your custom data in a web browser. We are going to talk about the folium library, which takes care of rendering the map, and about how to feed your custom data into it.
☢ nuclear physicist
⨋ data scientist
✉ linkedin.com/in/vojtech-filipec/
㏚ github.com/vojtech-filipec/
| ID | longitude | latitude | object name | # users | (other properties) |
|---|---|---|---|---|---|
| 0 | 17.269668 | 48.284123 | Malokarpatská knižnica v Pezinku | 980 | ... |
| 1 | 17.256186 | 48.187381 | Obecná knižnica Ivanka pri Dunaji | 1100 | ... |
| 2 | 22.146649 | 48.985100 | Mestská knižnica Snina | 1500 | ... |
| 3 | 19.668956 | 48.328798 | Novohradská knižnica | 754 | ... |
| 4 | 18.272445 | 47.868790 | Obecná knižnica Dulovce | 222 | ... |
| ... | ... | ... | ... | ... | ... |
| high N | 21.372445 | 48.068790 | Yet another library | 883 | ... |
To visualize this:
POIs = Points Of Interest = customer addresses, retail branches, trip destinations, ...
0. Preparations: download all libraries in Slovakia
1. Use-case #1: Plot a set of POIs
   - matplotlib
   - folium
2. Use-case #2: Choropleth map
   - h3 + folium

Goal: use OpenStreetMap to download all libraries in Slovakia
import requests
import overpy
api = overpy.Overpass()
qry_libraries = """
(area["ISO3166-1"="SK"];) ->.slovakia;
nwr["amenity"="library"](area.slovakia);
out tags center;
"""
res = api.query(qry_libraries)
details: refer to Appendix
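The step from the Overpass result to `df_libs` is left to the Appendix. The following is only a minimal sketch of what it could look like, assuming the result exposes `nodes`, `ways` and `relations` as overpy does, and that ways and relations carry `center_lat` / `center_lon` because the query ends with `out tags center`. The helper names and the stand-in result object are hypothetical, and taking `cnt_tags` to be the number of tags is an assumption.

```python
from types import SimpleNamespace


def elements_to_rows(result):
    """Flatten an Overpass result into one dict per OSM object."""
    rows = []
    for node in result.nodes:
        rows.append(_row(node, "node", node.lat, node.lon))
    for way in result.ways:
        rows.append(_row(way, "way", way.center_lat, way.center_lon))
    for rel in result.relations:
        rows.append(_row(rel, "relation", rel.center_lat, rel.center_lon))
    return rows


def _row(element, object_type, lat, lon):
    # Column names mirror the df_libs table; cnt_tags = number of tags (assumption).
    return {
        "object_id": element.id,
        "object_type": object_type,
        "lon": float(lon),
        "lat": float(lat),
        "library_name": element.tags.get("name"),
        "tags": dict(element.tags),
        "cnt_tags": len(element.tags),
    }


# Hypothetical stand-in for `res = api.query(qry_libraries)` so the sketch runs offline.
fake_result = SimpleNamespace(
    nodes=[SimpleNamespace(id=54583383, lat="48.187381", lon="17.256186",
                           tags={"amenity": "library",
                                 "name": "Obecná knižnica Ivanka pri Dunaji"})],
    ways=[SimpleNamespace(id=1, center_lat=48.0, center_lon=18.0,
                          tags={"amenity": "library"})],
    relations=[],
)
rows = elements_to_rows(fake_result)
# With the real result: df_libs = pandas.DataFrame(elements_to_rows(res))
```

Libraries without a `name` tag end up with `library_name` set to `None`, which is worth keeping in mind when building pop-ups.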
display(df_libs.head())
display(df_libs.shape)
| | object_id | object_type | lon | lat | library_name | tags | cnt_tags |
|---|---|---|---|---|---|---|---|
| 0 | 36154951 | node | 17.269668 | 48.284123 | Malokarpatská knižnica v Pezinku | {'addr:city': 'Pezinok', 'addr:housenumber': '... | 10 |
| 1 | 54583383 | node | 17.256186 | 48.187381 | Obecná knižnica Ivanka pri Dunaji | {'addr:city': 'Ivanka pri Dunaji', 'addr:house... | 10 |
| 2 | 191874699 | node | 22.146649 | 48.985100 | Mestská knižnica Snina | {'addr:city': 'Snina', 'addr:postcode': '069 0... | 5 |
| 3 | 256056268 | node | 19.668956 | 48.328798 | Novohradská knižnica | {'addr:city': 'Lučenec', 'addr:conscriptionnum... | 12 |
| 4 | 258101743 | node | 18.272445 | 47.868790 | Obecná knižnica Dulovce | {'addr:city': 'Dulovce', 'addr:housenumber': '... | 9 |
(1154, 7)
object_type and object_id together define the object's URL in OpenStreetMap, e.g. https://www.openstreetmap.org/node/54583383
matplotlib scatterplot by lat/long

import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(16, 10))
ax.set_xlabel('longitude')
ax.set_ylabel('latitude')
plt.scatter(df_libs['lon'], df_libs['lat'])
plt.show()
display(first_map)
import folium
m = folium.Map(location=[df_libs.lat.mean(), df_libs.lon.mean()], zoom_start=8, tiles='OpenStreetMap')
display(m)
m = folium.Map(location=[df_libs.lat.mean(), df_libs.lon.mean()], zoom_start=8, tiles='OpenStreetMap')
for row in df_libs.itertuples():
    folium.CircleMarker(  # one circle per library
        location=[row.lat, row.lon],  # marker location, mind the order: [Y, X], i.e. [latitude, longitude]
        radius=6,
        fill=True
    ).add_to(m)
display(m)
def create_color(row):
    # Map the number of OSM tags to a yellow-to-red scale
    if row.cnt_tags < 5:
        color = '#ffeda0'  # '#FFEDA0' works too, yellow
    elif row.cnt_tags < 7:
        color = '#fed976'
    elif row.cnt_tags < 9:
        color = '#feb24c'
    elif row.cnt_tags < 11:
        color = '#fd8d3c'
    elif row.cnt_tags < 13:
        color = '#fc4e2a'
    else:
        color = '#bd0026'  # dark red
    return color

for row in df_libs.itertuples():
    folium.CircleMarker(
        location=[row.lat, row.lon],
        radius=6,
        fill=True,
        color=create_color(row)  # COLOR
    ).add_to(m)
display(m)
Generate your color scale: https://hihayk.github.io/scale
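The if/elif chain in `create_color` can also be written as a lookup table, which makes it easier to paste in a new scale. This is a sketch of an alternative, not the code used in the talk; the names `THRESHOLDS`, `COLORS` and `color_for_count` are hypothetical, while the thresholds and colours are copied from `create_color`.

```python
from bisect import bisect_right

# Upper bounds (exclusive) of each bin, and one colour per bin plus a final catch-all.
THRESHOLDS = [5, 7, 9, 11, 13]
COLORS = ['#ffeda0', '#fed976', '#feb24c', '#fd8d3c', '#fc4e2a', '#bd0026']


def color_for_count(cnt_tags):
    """Return the colour of the bin that cnt_tags falls into."""
    # bisect_right counts how many thresholds are <= cnt_tags,
    # which is exactly the index of the matching bin.
    return COLORS[bisect_right(THRESHOLDS, cnt_tags)]


for cnt in (4, 5, 12, 13):
    print(cnt, color_for_count(cnt))
```

Inside the plotting loop this would be called as `color_for_count(row.cnt_tags)`.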
def create_popup(row):
    # HTML body of the pop-up: name, number of tags and a link to the OSM object
    html = """
{name}
<br>
# tags: <b> {cnt} </b>
<br>
<a href="https://www.openstreetmap.org/{object_type}/{object_id}">link to OSM detail</a>
""".format(name=row.library_name,
           cnt=row.cnt_tags,
           object_type=row.object_type,
           object_id=row.object_id)
    return folium.Popup(folium.IFrame(html, width=450, height=110))

m = folium.Map(location=[df_libs.lat.mean(), df_libs.lon.mean()], zoom_start=8, tiles='OpenStreetMap')
for row in df_libs.itertuples():
    folium.CircleMarker(
        location=[row.lat, row.lon],
        radius=6,
        fill=True,
        color=create_color(row),
        popup=create_popup(row)  # POPUP
    ).add_to(m)
display(m)
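Library names come from user-contributed OSM tags, so they may contain characters such as `<` or `&` that have a special meaning in HTML. A hedged sketch of building the same pop-up body with escaping from the standard library; `POPUP_TEMPLATE` and `popup_html` are hypothetical names, and the template text is copied from `create_popup`.

```python
from html import escape

POPUP_TEMPLATE = """
{name}
<br>
# tags: <b> {cnt} </b>
<br>
<a href="https://www.openstreetmap.org/{object_type}/{object_id}">link to OSM detail</a>
"""


def popup_html(name, cnt_tags, object_type, object_id):
    """Fill the pop-up template, escaping the free-text fields."""
    return POPUP_TEMPLATE.format(
        name=escape(str(name)),
        cnt=int(cnt_tags),
        object_type=escape(str(object_type)),
        object_id=int(object_id),
    )


html_body = popup_html("Knižnica <Test> & Co", 3, "node", 54583383)
print(html_body)
```

The resulting string can be passed to `folium.IFrame` in place of the inline template.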
folium.Map(tiles= ...)
show_100_libraries('CartoDB positron')
show_100_libraries('Stamen Toner')
show_100_libraries('Stamen Terrain')
Marker vs. CircleMarker

m = folium.Map(location=[df_libs.lat.mean(), df_libs.lon.mean()], zoom_start=8, tiles='OpenStreetMap')
for row in df_libs.itertuples():
    folium.Marker(
        location=[row.lat, row.lon],
        icon=folium.Icon(icon='book')  # <---- specify the icon
    ).add_to(m)
display(m)
from folium.plugins import MarkerCluster
m = folium.Map(location=[df_libs.lat.mean(), df_libs.lon.mean()], zoom_start=9, tiles='OpenStreetMap')
marker_cluster = MarkerCluster(control=False).add_to(m) # <<<<<
for row in df_libs.itertuples():
    folium.CircleMarker(
        location=[row.lat, row.lon],
        radius=7,
        fill=True,
        color=create_color(row),
        popup=create_popup(row)
    ).add_to(marker_cluster)  # <<<<< add to the cluster, not to the map
display(m)
h3 + folium

A choropleth map ("kartogram" in CZ/SK) uses a color to visualize the aggregated summary of points within each area.
Unlike a plot of single POIs, a choropleth map visualizes a property of an area:
Folium offers choropleth maps:
extension: Uber's h3
h3

# source: https://h3geo.org/docs/core-library/restable
u3_hexagons
| resolution | area_km2 | edge_km | cnt_indexes |
|---|---|---|---|
| 0 | 4.250547e+06 | 1107.712591 | 122 |
| 1 | 6.072210e+05 | 418.676005 | 842 |
| 2 | 8.674585e+04 | 158.244656 | 5882 |
| 3 | 1.239226e+04 | 59.810858 | 41162 |
| 4 | 1.770324e+03 | 22.606379 | 288122 |
| 5 | 2.529034e+02 | 8.544408 | 2016842 |
| 6 | 3.612905e+01 | 3.229483 | 14117882 |
| 7 | 5.161293e+00 | 1.220630 | 98825162 |
| 8 | 7.373276e-01 | 0.461355 | 691776122 |
| 9 | 1.053325e-01 | 0.174376 | 4842432842 |
| 10 | 1.504750e-02 | 0.065908 | 33897029882 |
| 11 | 2.149600e-03 | 0.024911 | 237279209162 |
| 12 | 3.071000e-04 | 0.009416 | 1660954464122 |
| 13 | 4.390000e-05 | 0.003560 | 11626681248842 |
| 14 | 6.300000e-06 | 0.001349 | 81386768741882 |
| 15 | 9.000000e-07 | 0.000510 | 569707381193162 |
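
The `cnt_indexes` column follows a simple closed form: there are 122 cells at resolution 0, and each finer resolution has roughly seven times as many, which is also why the average area shrinks by a factor of about 7 per step. A small check that reproduces the column (the function name `h3_cell_count` is hypothetical):

```python
def h3_cell_count(resolution):
    """Number of H3 cells at a given resolution: 2 + 120 * 7**resolution."""
    return 2 + 120 * 7 ** resolution


for res in (0, 5, 15):
    print(res, h3_cell_count(res))
# 0 122
# 5 2016842
# 15 569707381193162
```

Resolution 5, used for the map in this section, gives hexagons of about 253 km² each.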
h3.geo_to_h3(latitude, longitude, resolution)
h3.h3_to_geo(hexagon_ID)

display(df_libs_g.head())
display(df_libs_g.shape)
| | hex5 | cnt_libraries | lat | lon |
|---|---|---|---|---|
| 0 | 851e0003fffffff | 15 | 48.758350 | 18.303044 |
| 1 | 851e0007fffffff | 13 | 48.762389 | 18.541690 |
| 2 | 851e000bfffffff | 8 | 48.628137 | 18.184351 |
| 3 | 851e000ffffffff | 11 | 48.632401 | 18.422525 |
| 4 | 851e0013fffffff | 14 | 48.883941 | 18.183187 |
(204, 4)
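The aggregation behind `df_libs_g` is not shown on the slides: each library is assigned to a hexagon and the libraries are counted per hexagon. A minimal sketch of that idea follows; the indexing function is passed in so that the example runs without h3 installed. `count_per_cell` and `toy_cell` are hypothetical names, and `toy_cell` is only a stand-in grid, not H3.

```python
from collections import Counter


def count_per_cell(points, to_cell):
    """Count (lat, lon) points per cell ID returned by to_cell(lat, lon)."""
    return Counter(to_cell(lat, lon) for lat, lon in points)


def toy_cell(lat, lon):
    # Hypothetical stand-in for h3.geo_to_h3(lat, lon, 5): a 1-degree square grid.
    return f"{int(lat)}_{int(lon)}"


points = [(48.284123, 17.269668), (48.187381, 17.256186), (48.985100, 22.146649)]
counts = count_per_cell(points, toy_cell)
print(counts)
# With h3 (v3 naming, as used in this talk):
# counts = count_per_cell(points, lambda lat, lon: h3.geo_to_h3(lat, lon, 5))
```

The `lat` and `lon` columns of `df_libs_g` would then come from calling `h3.h3_to_geo` on each hexagon ID to get its centre.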
colours and pop-ups:
def create_color_chorophlet(row):
    # Map the number of libraries in a hexagon to a green-to-red scale
    if row.cnt_libraries < 2:
        color = '#0A6600'
    elif row.cnt_libraries < 4:
        color = '#419900'
    elif row.cnt_libraries < 5:
        color = '#94cc00'
    elif row.cnt_libraries < 6:
        color = '#ffff00'
    elif row.cnt_libraries < 9:
        color = '#ffac1a'
    elif row.cnt_libraries < 12:
        color = '#ff6133'
    else:
        color = '#ff4d54'
    return color

def create_popup_chorophlet(row):
    # HTML body of the pop-up: library count and the hexagon centre
    html = """
# libraries: <b> {cnt} </b>
<br>
hexagon center: {lat} / {lon}
""".format(lat='{:2.4f}'.format(row.lat),
           lon='{:2.4f}'.format(row.lon),
           cnt=row.cnt_libraries)
    return folium.Popup(folium.IFrame(html, width=300, height=70))
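hex_col = 'hex5'  # column holding the H3 index (name taken from the df_libs_g table above)
m = folium.Map(location=[df_libs_g.lat.mean(), df_libs_g.lon.mean()], zoom_start=7, tiles='CartoDB positron')
for i in df_libs_g.index:
    row = df_libs_g.loc[i]
    # visualize_hexagons comes from the Uber notebook cited below
    m = visualize_hexagons([row[hex_col]], color=create_color_chorophlet(row), folium_map=m)
display(m)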
# source: https://nbviewer.jupyter.org/github/uber/h3-py-notebooks/blob/master/notebooks/usage.ipynb
refer to my PyConCZ 2020 talk: https://github.com/vojtech-filipec/PyConCZ-OSM-API
OSM appears to contain only a subset of all libraries: this source https://www.infolib.sk/sk/kniznice/adresare/zoznam-kniznic-sr/ describes more than 3000 libraries.
We skipped this: the geopandas library follows the standard pandas API while offering useful operations on geometries.
A nice overview in Towards Data Science: https://towardsdatascience.com/the-battle-of-interactive-geographic-visualization-part-5-folium-cc2213d29a7